Fast Incremental Indexing for Full-Text Information Retrieval

نویسندگان

  • Eric W. Brown
  • James P. Callan
  • W. Bruce Croft
چکیده

Full-text information retrieval systems have traditionally been designed for archival environments. They often provide little or no support for adding new documents to an existing document collection, requiring instead that the entire collection be re-indexed. Modern applications, such as information filtering, operate in dynamic environments that require frequent additions to document collections. We provide this ability using a traditional inverted file index built on top of a persistent object store. The data management facilities of the persistent object store are used to produce efficient incremental update of the inverted lists. We describe our system and present experimental results showing superior incremental indexing and competitive query processing performance.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Journal of Emerging Trends in Computing and Information Sciences::Automatic Learning Context Tool for Effective Personal Document Indexing and Retrieval

Managing digital documents has become a time consuming process due to sheer scale. Most users manage their personal documents by creating logical hierarchical folder structures. This logical structure depends on the user’s assessment of the context of the document. Basic file structuring has not been changed for decades and hierarchical file structure remains the same. But there has been a surg...

متن کامل

The Study on Key Technology of Mongolian Full-Text Retrieval

With the development of the Mongolian corpus and website, an increasing number of people have focused their attention on the accurate, complete and fast retrieval of the information that they need. In this paper, such key technological issues in Mongolian full-text retrieval as character shape indexing, drawing the Mongolian verb stem and the automatic recognition of the Mongolian homographic w...

متن کامل

Why Keywording Matters

Most information retrieval systems nowadays use full-text searching because algorithms are fast and very good results can be achieved. However, the use of full text indexes has its limitations, especially in the multilingual context, and it is not a solution for further information access requirements such as the need for efficient document navigation, categorisation, abstracting and other mean...

متن کامل

Hierarchical Concept Indexing of Full - text Documents in the UMLS ® Information Sources Map 1 Lawrence W . Wright

Full-text documents are a vital and rapidly growing part of online biomedical information. A single large document can contain as much information as a small database, but normally lacks the tight structure and consistent indexing of a database. Retrieval systems will often miss highly relevant parts of a document if the document as a whole appears irrelevant. Access to full-text information is...

متن کامل

An experiment in software component retrieval

Our research centers around exploring methodologies for developing reusable software, and developing methods and tools for building inter-enterprise information systems with reusable components. In this paper, we focus on an experiment in which different component indexing and retrieval methods were tested. The results are surprising. Earlier work had often shown that controlled vocabulary inde...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1994